
Primer on causal inference

  • Metadata
  • Predictive models on their own can't answer the questions businesses usually want answered -> will doing A cause B?
    • Prediction and inference are opposite goals
    • Correct inference often requires us to sacrifice predictive power
    • Maximum predictive power can lead to incorrect causal inference
  • Experiments
    • Simple A/B Test
      • From a frequentist POV, you want to do a t-test to check whether there is a "large" difference between the two populations
      • Calculate power as a function of duration (holding α and effect size constant)
      • Use an arbitrary α = 0.1 and β = 0.1 (i.e., power = 0.9) if you think a false negative is just as bad as a false positive
      • Perform a one-sided test, since we usually care more about the sign of the effect than its magnitude
      • The outcomes
        • B is significantly better than A. Ship B
        • B is significantly worse than A. Keep A
        • B is not statistically different from A??? Do we keep A or ship B?
      • But detecting the effect size usually requires a larger sample than is feasible
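The power calculation above can be sketched with `statsmodels` (the effect size, α, and power values below are illustrative, matching the rule of thumb in these notes):

```python
# Sketch: required sample size per arm for a one-sided two-sample t-test,
# holding alpha and effect size constant. Numbers are illustrative.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

n_per_arm = analysis.solve_power(
    effect_size=0.1,       # standardized effect size (Cohen's d); "small"
    alpha=0.1,             # significance level
    power=0.9,             # 1 - beta
    alternative="larger",  # one-sided: we mostly care about the sign
)
print(round(n_per_arm))    # sample size needed per group
```

Even with a lenient α = 0.1, a small effect size pushes the required sample into the thousands per arm, which is the feasibility problem the bullet above describes.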
    • Bayesian Approach
      • Treat the group assignment as a random effect
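The notes mention modeling group assignment as a random effect; as a simpler illustration of the Bayesian framing (a swapped-in Beta-Binomial model, not the random-effects model itself, and all counts are made up), you report P(B beats A) instead of a p-value:

```python
# Sketch: Beta-Binomial Bayesian A/B comparison (hypothetical data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical observed conversions / trials per arm.
conv_a, n_a = 120, 1000
conv_b, n_b = 140, 1000

# Beta(1, 1) prior; posterior is Beta(1 + successes, 1 + failures).
post_a = stats.beta(1 + conv_a, 1 + n_a - conv_a)
post_b = stats.beta(1 + conv_b, 1 + n_b - conv_b)

# Monte Carlo estimate of P(rate_B > rate_A).
draws_a = post_a.rvs(100_000, random_state=rng)
draws_b = post_b.rvs(100_000, random_state=rng)
p_b_better = (draws_b > draws_a).mean()
print(f"P(B > A) = {p_b_better:.3f}")
```

This sidesteps the awkward "not statistically different" outcome: the posterior probability is directly interpretable as a bet on shipping B.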
    • Factorial Design
    • Crossover Design
    • Blocking
  • Quasi-Experiments
    • Difference-in-differences
    • Interrupted time series
    • Synthetic controls
    • Google's CausalImpact package
      • Observe a time series X with some intervention
      • Build a [[counterfactual]]: what would the time series have been without the intervention
      • Look for ingredients to put into a blender
      • End result is a good counterfactual
      • The difference between observed and counterfactual is the [[terra-cotta]] ==causal effect estimate==
      • [[champagne]] ==Key Assumptions==
        • Changes in X do not affect the ingredients in the synthetic control
        • The relationship between X and the ingredients would have continued unchanged without the intervention
      • Most of the work is in finding the ingredients, and making sure the ingredients do not produce arbitrary estimates
      • [[champagne]]==Rule of thumb: the post-intervention period shouldn't be too long because forecasts break down the farther ahead we look. The pre-intervention period should be 3 to 4 times the length of the post-intervention period==
      • If there is a lot of pre-intervention data, split it into three periods: exploration (oldest data, for finding ingredients), validation (middle), and estimation (most recent)
      • Choose ingredients that are correlated with X
      • Choose the ingredients and X before the quasi-experiment is run
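The counterfactual idea above can be sketched with a plain pre-period regression on simulated data (the real CausalImpact package fits a Bayesian structural time-series model; everything below, including the lift of 10, is made up):

```python
# Sketch of the CausalImpact idea: fit outcome ~ ingredient on the
# pre-intervention period, predict the post period as the counterfactual,
# and take the difference as the causal effect estimate. Simulated data.
import numpy as np

rng = np.random.default_rng(42)

T, t0 = 120, 90                                  # series length, intervention time
control = 50 + np.cumsum(rng.normal(0, 1, T))    # "ingredient" series
y = 2.0 * control + rng.normal(0, 1, T)          # outcome tracks the ingredient
y[t0:] += 10                                     # simulated post-intervention lift

# Fit y ~ control on the pre-intervention period only.
X_pre = np.column_stack([np.ones(t0), control[:t0]])
coef, *_ = np.linalg.lstsq(X_pre, y[:t0], rcond=None)

# Counterfactual: what y would have been without the intervention.
X_post = np.column_stack([np.ones(T - t0), control[t0:]])
counterfactual = X_post @ coef

effect = (y[t0:] - counterfactual).mean()
print(f"estimated causal effect = {effect:.1f}")  # should roughly recover the simulated lift
```

Note how both key assumptions show up here: the intervention does not touch `control`, and the pre-period relationship (slope ≈ 2) is assumed to hold in the post period.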
  • Observational Data
    • When we cannot intervene due to real-life constraints and we can only observe
    • Causal DAG